Lab CudaVision - Lab Project

Learning Vision Systems on Graphics Cards (MA-INF 4308)

Salih MARANGOZ (s6samara)
Elif Cansu YILDIZ (s6efyild)

Table of Contents

Import Modules

Datasets

Extracting all datasets may use more than 100 GB of disk space. The tree below may help you extract only the parts that are needed; for example, we did not use the 35mm_focallength data of the driving dataset. Modify DATASET_ROOT if your dataset folder is in another location.

├── labvision_project_folder
│   └── ...
└── dataset
    ├── driving
    │   ├── disparity
    │   │   └── 15mm_focallength
    │   └── frames_cleanpass_webp
    │       └── 15mm_focallength
    ├── flyingthings3d
    │   ├── disparity
    │   │   ├── TEST
    │   │   └── TRAIN
    │   └── frames_cleanpass_webp
    │       ├── TEST
    │       └── TRAIN
    ├── kitti
    │   └── training
    │       ├── image_2
    │       ├── image_3
    │       └── disp_occ_0
    └── monkaa
        ├── disparity
        │   ├── a_rain_of_stones_x2
        │   └── ...
        └── frames_cleanpass_webp
            ├── a_rain_of_stones_x2
            └── ...

Transforms

We implemented *-Multi versions of some transforms that can process left-image, right-image, and disparity-map inputs together. In these classes, cropping operations are applied to all inputs, while jittering and normalization are applied only to the RGB images. We also added SanitizeImageSizesMulti for cropping data, mainly for the KITTI dataset. This transform crops the image and disparity data from the left and right to match the target width, and only from the top to match the target height (since KITTI has no disparity values in the top section of its images).
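As an illustration, the cropping logic of SanitizeImageSizesMulti can be sketched roughly like this (the constructor arguments and tensor layout are our illustrative assumptions; the actual class in our codebase may differ in details):

```python
import torch

class SanitizeImageSizesMulti:
    """Crops (left, right, disparity) tensors to a fixed target size.

    Width is trimmed from both the left and right edges; height is
    trimmed only from the top, since KITTI images have no disparity
    ground truth in their upper region.
    """
    def __init__(self, target_h, target_w):
        self.target_h = target_h
        self.target_w = target_w

    def __call__(self, left, right, disp):
        # Images are (C, H, W); disparity may be (1, H, W) or (H, W).
        h, w = left.shape[-2], left.shape[-1]
        dw = w - self.target_w
        dh = h - self.target_h
        x0 = dw // 2                  # trim half of the excess width from the left ...
        x1 = w - (dw - dw // 2)       # ... and the rest from the right
        y0 = dh                       # trim excess height only from the top
        crop = lambda t: t[..., y0:h, x0:x1]
        return crop(left), crop(right), crop(disp)
```

Applied to a KITTI-sized pair (e.g. 375x1242), this yields three tensors of exactly target_h x target_w, with the disparity map cropped identically to the images.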

Analyze Datasets

We analyzed the histogram of disparity values and the coverage achieved for a given maximum-disparity parameter. We found that 192 would be a good maximum disparity.
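The coverage analysis can be sketched as follows (the helper name and the treatment of zero disparities as invalid, as in KITTI, are our illustrative assumptions):

```python
import numpy as np

def disparity_coverage(disp_maps, max_disp=192):
    """Fraction of valid ground-truth pixels with disparity below max_disp.

    A coverage close to 1.0 means that clamping the network to this
    disparity range loses almost no pixels. Zeros are treated as
    invalid (no ground truth) and are excluded from the count.
    """
    valid = covered = 0
    for d in disp_maps:
        mask = d > 0                          # valid ground-truth pixels
        valid += mask.sum()
        covered += (d[mask] < max_disp).sum()
    return covered / valid

# Example with synthetic disparity maps drawn uniformly from [0, 300):
rng = np.random.default_rng(0)
maps = [rng.uniform(0.0, 300.0, size=(64, 64)) for _ in range(5)]
print(disparity_coverage(maps, max_disp=192))
```

Running this over each dataset's disparity maps, and comparing a few candidate max_disp values, is how a value like 192 can be justified quantitatively.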

Driving Dataset

Monkaa Dataset

Flyingthings3d Dataset

Kitti Dataset

Split/Concat Datasets

We split Scene Flow into 95% for training and 5% for validation. For KITTI, we held out 50 samples for testing and kept 150 samples for training/validation, then split those 150 samples 80% for training and 20% for validation.
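A minimal sketch of this split/concat scheme, with small placeholder datasets standing in for our real ones (sizes and variable names are illustrative only):

```python
import torch
from torch.utils.data import TensorDataset, ConcatDataset, random_split

# Placeholders standing in for the real driving/monkaa/flyingthings3d/kitti datasets.
driving        = TensorDataset(torch.zeros(400, 1))
monkaa         = TensorDataset(torch.zeros(300, 1))
flyingthings3d = TensorDataset(torch.zeros(300, 1))
kitti          = TensorDataset(torch.zeros(200, 1))

g = torch.Generator().manual_seed(42)          # reproducible splits

# Scene Flow = concatenation of its sub-datasets, split 95% / 5%.
sceneflow = ConcatDataset([driving, monkaa, flyingthings3d])
n_val = int(0.05 * len(sceneflow))
sf_train, sf_val = random_split(sceneflow,
                                [len(sceneflow) - n_val, n_val],
                                generator=g)

# KITTI: hold out 50 samples for testing, then split the remaining
# 150 samples 80/20 into training and validation.
kitti_trainval, kitti_test = random_split(kitti, [150, 50], generator=g)
kitti_train, kitti_val = random_split(kitti_trainval, [120, 30], generator=g)
```

Fixing the generator seed makes the splits reproducible across runs, which matters when experiments are interrupted and resumed.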

Dataloader

Larger batch sizes can be set for training. Since the test images are larger than the training images, we recommend using a small batch size for testing.
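For example, with illustrative batch sizes (the dataset objects and image sizes here are placeholders, not our real ones):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder datasets: training uses small crops, testing uses
# (larger) full-size images, hence the different batch sizes.
train_dataset = TensorDataset(torch.zeros(64, 3, 256, 512))
test_dataset  = TensorDataset(torch.zeros(8, 3, 544, 960))

# Raise the training batch size as far as GPU memory allows;
# keep the test batch size small (here 1) for the full-size images.
train_loader = DataLoader(train_dataset, batch_size=8, shuffle=True,
                          num_workers=4, pin_memory=True, drop_last=True)
test_loader  = DataLoader(test_dataset, batch_size=1, shuffle=False,
                          num_workers=4, pin_memory=True)
```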

Training Device

Experiments

We created all_experiments.ipynb (and all_experiments.html) to keep this report clean. Below we show some code examples to give insight into how we ran our experiments systematically. Each experiment also includes a CELL.txt file showing how it was run.

Code Examples

For Pretraining:

Trains GCnet with a maximum disparity of 192 for 10 epochs on the pretraining dataset.

model = GCnet(192).to(device)
e1 = t_utils.Experiment(name         = "GCnet-pretraining", 
                        description  = "Pretraining GCnet with maxdisp=192 for 10 epochs",
                        model        = model,
                        criterion    = nn.SmoothL1Loss(), 
                        scheduler    = None,
                        optimizer    = torch.optim.Adam(model.parameters(), lr=1e-4), 
                        train_loader = pretraining_train_dataloader, 
                        val_loader   = pretraining_val_dataloader, 
                        max_iter     = len(pretraining_train_dataset)*10,
                        val_interval = 2500,
                        vis_interval = 500,
                        save_interval= 5000,
                        device       = device)
e1.train_model()
e1.save()

For Finetuning:

Loads a pretrained PSMNet model and finetunes it for 20000 iterations.

checkpoint = torch.load("runs/PSM-pretraining-2021_09_22-08_31_27_192disp_10epoch_default/model_manual_save.pt")
model = PSMNet(192).to(device)
model.load_state_dict(checkpoint['model'])

e1 = t_utils.Experiment(name         = "PSM-finetuning", 
                        description  = "Finetuning PSMNet with maxdisp=192 with lr=1e-5 on Pretrained model with 10 epochs.",
                        model        = model,
                        criterion    = nn.SmoothL1Loss(), 
                        scheduler    = None,
                        optimizer    = torch.optim.Adam(model.parameters(), lr=1e-5), 
                        train_loader = finetuning_train_dataloader, 
                        val_loader   = finetuning_val_dataloader, 
                        max_iter     = 20000,
                        val_interval = 200,
                        vis_interval = 100,
                        save_interval= 500,
                        device       = device)
e1.train_model()
e1.save()

To Continue an Experiment:

We made it easy to interrupt training and continue it later. The name, description, and other parameters are loaded automatically.

model = PSMNet(192).to(device)
e1 = t_utils.Experiment(model        = model,
                        criterion    = nn.SmoothL1Loss(), 
                        scheduler    = None,
                        optimizer    = torch.optim.Adam(model.parameters(), lr=1e-4), 
                        train_loader = finetuning_train_dataloader, 
                        val_loader   = finetuning_val_dataloader, 
                        device       = device)
e1.load("runs/PSM-finetuning-2021_09_25-01_29_49_20000iter_lr1e-4/model_manual_save.pt")
e1.train_model()
e1.save()

To Continue an Experiment More:

Experiment parameters can be modified after loading.

e1.max_iter = 30000
e1.train_model()
e1.save()

Outputs

Full-PSMNet (finetuned with Scheduler) Outputs

GCNet Outputs

Compare Full-PSMNet & GCNet

Compare Scheduler Effect of Full-PSMNet

Training, Validation, Test Statistics

All evaluation statistics can be found in all_experiments.ipynb (and all_experiments.html). We separated out the other experiments to keep this report clean.
Training/validation statistics and model outputs can be viewed with TensorBoard. Try running TensorBoard with modified parameters like this: tensorboard --samples_per_plugin="scalar=10000,images=200" --logdir runs/


(BEST) Full-PSMNet finetuned with Scheduler

Full-PSMNet finetuned without Scheduler

GCNet

References